[DRAFT] /streaming_chat endpoint PoC #1527

Draft · TamiTakamiya wants to merge 8 commits into main from TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc
Conversation

TamiTakamiya (Contributor):

Jira Issue: https://issues.redhat.com/browse/AAP-39044

Description

A PoC for the /streaming_chat endpoint.

Testing

Steps to test

  1. Pull down the PR
  2. ...
  3. ...

Scenarios tested

Production deployment

  • This code change is ready for production on its own
  • This code change requires the following considerations before going to production:

@TamiTakamiya changed the base branch from main to TamiTakamiya/AAP-39043/chatbot-streaming-ui on February 10, 2025 21:12
Base automatically changed from TamiTakamiya/AAP-39043/chatbot-streaming-ui to main on February 10, 2025 21:21
@TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch 6 times, most recently from dae29f3 to 91c8daa on February 14, 2025 21:35
@TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from e833b89 to d251b56 on February 18, 2025 18:28
@TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from dc0fd2c to 23483b5 on February 18, 2025 19:18
@TamiTakamiya force-pushed the TamiTakamiya/AAP-39044/streaming-chat-endpoint-poc branch from 2d9bdbf to eed802b on February 18, 2025 20:06
Quality Gate failed

Failed conditions
62.1% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@manstis (Contributor) left a comment:


I had a peek @TamiTakamiya

I don't think we need the async_invoke function as you've now moved streaming support to a new type of pipeline (ModelPipelineStreamingChatBot).

"inference_url": chatbot_service_url or "http://localhost:8000",
"model_id": chatbot_service_model_id or "granite3-8b",
"verify_ssl": model_service_verify_ssl,
"stream": False,
Contributor:

IMO use of environment variables is deprecated.

ANSIBLE_AI_MODEL_MESH_CONFIG should be considered the master configuration.

I'd therefore propose (a sketch of the result follows the list):

  • Add a new line after L#182: chatbot_streaming = os.getenv("CHATBOT_STREAMING")
  • Change L#201 to "stream": chatbot_streaming
  • Revert this change.
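
A rough sketch of how the proposal above might look once applied. The service-related variables are placeholders for values already derived from ANSIBLE_AI_MODEL_MESH_CONFIG or settings, and the string-to-bool conversion is an added assumption, since os.getenv returns a string:

import os

# Placeholders standing in for values read elsewhere in settings.
chatbot_service_url = None
chatbot_service_model_id = None
model_service_verify_ssl = True

# Proposed new line (roughly "after L#182"); interpreting the raw string as a
# boolean is an assumption beyond the comment above.
chatbot_streaming = os.getenv("CHATBOT_STREAMING", "false").lower() == "true"

chatbot_pipeline_config = {
    "inference_url": chatbot_service_url or "http://localhost:8000",
    "model_id": chatbot_service_model_id or "granite3-8b",
    "verify_ssl": model_service_verify_ssl,
    # Proposed change to L#201: configuration-driven instead of hard-coded False.
    "stream": chatbot_streaming,
}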

)


StreamingChatBotResponse = Any
Contributor:

IMO this should be StreamingChatBotResponse = AsyncGenerator
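
For reference, a minimal sketch of the suggested alias; parameterising the generator (for example with bytes chunks) is an assumption beyond what the comment states:

from typing import AsyncGenerator

# Suggested alias; a narrower form such as AsyncGenerator[bytes, None] would
# also document the chunk type, but that is an assumption.
StreamingChatBotResponse = AsyncGenerator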

@@ -274,6 +301,9 @@ def alias() -> str:
def invoke(self, params: PIPELINE_PARAMETERS) -> PIPELINE_RETURN:
raise NotImplementedError

def async_invoke(self, params: PIPELINE_PARAMETERS) -> AsyncGenerator:
Contributor:

This method isn't needed (yes, yes, I know I proposed it, but I found a better way, I think).

def __init__(self, config: HttpConfiguration):
super().__init__(config=config)

def invoke(self, params: StreamingChatBotParameters) -> StreamingHttpResponse:
Contributor:

Change L#222-L#225 to simply be:

async def invoke(self, params: StreamingChatBotParameters) -> StreamingChatBotResponse:

i.e. there is no need for an invoke AND async_invoke function; just an invoke function.
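
A small, self-contained sketch of why a single async invoke suffices: an async def that yields is itself an async generator, so it can feed a streaming response directly, with no separate async_invoke wrapper. The class and parameter names below are stand-ins, not the project's real types:

import asyncio
from typing import AsyncGenerator


class FakeStreamingChatBotPipeline:
    # Stand-in for the real pipeline; only the shape of invoke() matters here.
    async def invoke(self, query: str) -> AsyncGenerator[bytes, None]:
        for word in ("streamed", "chat", "response"):
            await asyncio.sleep(0)  # placeholder for awaiting the model service
            yield f"data: {word}\n\n".encode()  # SSE-style chunk


async def main() -> None:
    pipeline = FakeStreamingChatBotPipeline()
    # The async generator returned by invoke() can be iterated directly,
    # or handed to a streaming HTTP response in the Django view.
    async for chunk in pipeline.invoke("hello"):
        print(chunk)


asyncio.run(main())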

},
summary="Streaming chat request",
)
def post(self, request) -> Response:
Contributor:

Can the return type be StreamingHttpResponse?

self.event.modelName = self.req_model_id or self.llm.config.model_id

return StreamingHttpResponse(
self.llm.async_invoke(
Contributor:

If my other changes are integrated, this can simply be self.llm.invoke(....)
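
Putting this together with the earlier question about the return type, a hypothetical sketch of the view method; the class name, the params construction, and the content_type are assumptions, and Django 4.2 or newer is assumed because earlier versions do not accept an async iterator as the StreamingHttpResponse body:

from django.http import StreamingHttpResponse
from rest_framework.views import APIView


class StreamingChatView(APIView):
    # Sketch only: self.llm, self.event, self.req_model_id and build_params()
    # are placeholders for the real view's attributes and helpers.
    def post(self, request) -> StreamingHttpResponse:
        params = self.build_params(request)  # hypothetical helper
        self.event.modelName = self.req_model_id or self.llm.config.model_id
        # The async generator returned by the pipeline's single invoke() is
        # passed straight to the streaming response.
        return StreamingHttpResponse(
            self.llm.invoke(params),
            content_type="text/event-stream",
        )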

@@ -159,7 +159,7 @@ def assert_basic_data(
self.assert_common_data(data, expected_status, deployed_region)
timestamp = data["timestamp"]
dependencies = data.get("dependencies", [])
self.assertEqual(10, len(dependencies))
self.assertEqual(11, len(dependencies))
Contributor:

JFYI, there's also a test in ansible-wisdom-testing that will break.
